Abstract


We have investigated the problem of reconstructing an ECG signal that is corrupted
by additive noise. The noise appears when the signal passes through a dispersive channel,
such as in a long-distance medical telemetry system. We have used neural networks for
the suppression of noise. The polynomial perceptron and the fractionally spaced recursive
polynomial perceptron are used as nonlinear classifiers to reconstruct the ECG signals.
The behaviour of the backpropagation algorithm and the complex backpropagation algorithm
is investigated. It is shown that both algorithms are powerful but face some practical
difficulties: the selection of the learning parameters, the number of hidden layers and
the number of nodes in each hidden layer is experimental. The polynomial perceptron is an
alternative nonlinear architecture for approximating the optimal equalizer solution. The
behaviour and nonlinear mapping ability of the polynomial perceptron and the fractionally
spaced bilinear perceptron are investigated. It is shown that these techniques can
approximate any continuous function within a specified accuracy. The manner in which these
neural network algorithms can be utilized is described. The complex neuron structure is
used to modify the above methods for complex input sequences. The performances of these
two methods are compared and their relative features and limitations are discussed. The
applications of these methods to 16-level quadrature amplitude modulation are considered
for the reconstruction of ECG signals. Simulation results suggest that the fractionally
spaced bilinear perceptron network recovers the ECG signal in a high-noise environment.
Chapter 1


Introduction 


Many expert systems have been developed in the last few years in an
attempt to solve medical diagnostic problems automatically. Our investigation indicates
that artificial neural networks may be a better solution for reconstructing ECG
(electrocardiogram) signals which are corrupted with noise and other interference in
long-distance telemetry.

In recent years neural networks have been proposed as solutions to a variety of problems
in areas such as signal processing (including image processing), pattern recognition,
identification, prediction and control of dynamical systems. Pattern recognition by
neural networks is useful in many practical problems such as medical diagnosis,
speech recognition and adaptive control. The purpose of any attempt at pattern recognition
is the identification of the underlying characteristics which are common to a class
of objects. The correct identification of the underlying characteristics enables one to
extrapolate, i.e., to identify a new object as belonging to a certain class on the basis of
its underlying characteristics and in spite of the variations of incidental characteristics
within the class.


1.1 Medical telemetry 

Medical telemetry systems transmit signals continuously or at recurrent intervals and
measure the variables which are displayed, recorded, or used to activate a control
mechanism.


Medical telemetry may be a combination of wired and wireless links. For example, a wireless
system may send data from the patient to the hospital console. Then, at the console, the
data may be piped over the telephone lines. In wireless telemetry, the carrier is a signal
which carries the data through the air. The data themselves modulate this carrier.
Modulation refers to a nonlinear operation in which message data are used to modify
some characteristic of the carrier. The carrier is usually a sinusoid or a periodic sequence
of pulses. Sometimes multiplexing is also used with telemetry systems to transmit
the data from different patient transducers simultaneously.

The most urgent application of medical telemetry is the immediate monitoring of an
unstable coronary condition upon the onset of distress symptoms. A specialist in the
city or on the other side of the earth can make an immediate assessment of the onset of
myocardial infarction. Most myocardial-infarction patients experience bradycardia,
tachycardia, heart block or excessive premature ventricular contractions before reaching
the hospital. These arrhythmias precede cardiac arrest, which kills 60 percent of heart
attack victims before they reach the hospital.

When a patient has a pacemaker installed, has cardiovascular surgery or treatment,
or has a chronic problem, there is often a need for frequent outpatient ECG data retrieval.
Normally this application requires that ECG data be transmitted from the patient's home
to the doctor's office or to the hospital. This is an ideal application of medical telemetry.
All wired and wireless systems need both a transmitter and a receiver. In multiplexing,
several signals are placed on one carrier. Signal conditioners convert and amplify the
data and match the incoming signal to the multiplexer or transmitter. A block diagram
of a typical single-channel system is shown in Fig. 1.1. The major difficulty which arises
in biotelemetry is interference with other systems. Therefore, after demodulation, a
channel equalizer is required to suppress the noise and interference which are added
during transmission of the data. A neural network may be used as the channel equalizer to
suppress the noise and interference.

We have focussed our attention on the part of channel equalization using neural
networks.
Fig. 1.1: Typical single-channel telemetry system

1.2 Artificial neural network 


The most important feature of artificial neural networks is that they perform a large
number of numerical operations in parallel. These operations involve simple arithmetic
operations as well as nonlinear mapping and computation of derivatives. The computational
elements used in neural networks are nonlinear and typically analog; these
elements are connected via weights that are typically adapted during use to improve
performance. Almost all data stored in the network are involved in the recall computation
at any given time. The distributed neural processing is typically performed within the
entire array composed of neurons and weights. This indicates that parallel distributed
computing makes efficient use of the stored data (weights) and of the input data. Therefore,
the artificial neural network resembles the human brain in two
respects:

• The knowledge is acquired by the network through the learning process.

• The interneuron connection strengths, known as synaptic weights, are used to store
the knowledge.

The knowledge refers to stored information or models used to interpret, predict and
appropriately respond to the output. Adaptation is a major focus of neural networks;
it provides a degree of robustness by compensating for minor variabilities in the
characteristics of the processing elements. Therefore, the modification of synaptic weights
provides the traditional method for the design of neural networks. The majority of the work
on neural networks is directed towards a better architecture for pattern classification.

1.3 Review 

A number of networks whose architectures are useful for pattern classification
have been proposed by several authors. Here, we briefly discuss the various
architectures.

In 1943, McCulloch and Pitts first proposed the standard architecture [17] [18] of the
neuron as follows:

$$y = f\left(\sum_{i=0}^{n} w_i x_i\right) \qquad (1.1)$$

where $x_i$ are the inputs to the neuron, $x_0$ is taken as the threshold input, $w_i$ are the
synaptic weights and $y$ is the output of the neuron. The function $f(\cdot)$ is nonlinear and
can be a hard limiter, threshold logic, sigmoid or hyperbolic tangent function.
There is no a priori reason to assume that the synaptic weights are constants. This is the
first approximation to the complex behaviour of a neuron, introduced by McCulloch and
Pitts.
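
As a minimal sketch of the neuron described above (the function names, weights and inputs here are illustrative assumptions, not values from this work), the output can be computed as a nonlinear function of the weighted sum of the inputs, with the first input acting as the threshold input:

```python
import math

def neuron_output(inputs, weights, activation=math.tanh):
    """Return y = f(sum_i w_i * x_i); inputs[0] plays the role of x_0."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return activation(sum(w * x for w, x in zip(weights, inputs)))

def hard_limiter(a):
    """Hard-limiter nonlinearity: +1 for a >= 0, otherwise -1."""
    return 1.0 if a >= 0.0 else -1.0

# Hypothetical example values: x_0 = 1 is the threshold input.
x = [1.0, 0.5, -0.2]
w = [-0.1, 0.8, 0.3]
y_tanh = neuron_output(x, w)                # smooth (hyperbolic tangent) output
y_hard = neuron_output(x, w, hard_limiter)  # hard-limited output
```

Swapping the `activation` argument is enough to move between the hard limiter and the smooth nonlinearities mentioned in the text.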

Williams-Zipser [6] proposed an architecture which is not purely feedforward. This
architecture permits any neuron in the network to be connected to any other neuron in
the network.

Jordan-Elman [5] proposed an architecture which has an extra layer of neurons that
copy the current activations of the hidden layer neurons and, after delaying these values
for one time instant, feed them back as additional inputs into the hidden neurons.

Lapedes-Farber [3] proposed an architecture which is defined as follows:

$$y(t) = f(a(t)), \qquad a(t) = \sum_{i=0}^{n} w_i x_i(t) \qquad (1.2)$$

where the activation function $a(t)$ is a linear combination of the inputs to the neuron.
The output is a nonlinear function of the activation function. The architecture is shown in
Fig. 1.2. If the input is derived from the previous layer only, then there is no feedback
around the neuron and the architecture is feedforward only. Alternatively, there are
various possibilities for placing feedback paths within the network. When the feedback is
incorporated as local activation feedback, as shown in Fig. 1.3, the input $x_i = a(t - i)$,
$i = 1, 2, \ldots, n$, is a delayed version of the activation, or a combination of delayed
versions of the activation and delayed versions of the input variables. When each synapse
incorporates a feedback structure, the local synapse feedback architecture results, as shown
in Fig. 1.4. This feedback structure may be in the form of an IIR or FIR filter, and each
synapse function may be modified by a linear transfer function. When the feedback is around
the nonlinear function, as shown in Fig. 1.5, the input is a delayed version of the
output values, i.e., $x_i = y(t - i)$. This type of feedback architecture is known as the local
feedback architecture. This architecture has the feedback after the nonlinear function,
while the local activation feedback and local synapse feedback have the feedback before
the nonlinear function; therefore the behaviour of this architecture is different from
that of the other two architectures.

Back-Tsoi [5] proposed a locally recurrent globally feedforward network with local
synapse feedback, which is called the IIR synapse multilayer perceptron. The Back-Tsoi
architecture is similar to the local synapse architecture shown in Fig. 1.4, where the
synaptic weights are replaced by linear transfer functions with poles and zeros. This is a
modified version of the McCulloch-Pitts neuron. The architecture is as follows:

$$y(t) = f\left(\sum_{i=0}^{n} G_i(z^{-1})\, x_i(t)\right) \qquad (1.3)$$

where

$$G_i(z^{-1}) = \frac{\sum_{j=0}^{m} b_{ij} z^{-j}}{1 + \sum_{j=1}^{m} a_{ij} z^{-j}}, \qquad i = 1, 2, \ldots, n \qquad (1.4)$$

is a linear transfer function and $z^{-i}x(t)$ is defined as $x(t - i)$. If feedback is taken from
the previous layer, then it is called local synapse feedback, and if it is derived from the
output $y(t)$, it is called global output feedback.

Frasconi-Gori-Soda [7] [5] proposed two architectures, namely the local activation feedback
architecture and the locally recurrent globally feedforward architecture with a feedback path
taken from the output of the hidden layer neuron. The generalized architecture of
Frasconi-Gori-Soda is shown in Fig. 1.6, where $D$ is the unit time delay. This architecture
differs from the Back-Tsoi architecture in that Back-Tsoi take the local feedback
before the nonlinearity, while Frasconi-Gori-Soda take the local feedback after it has
passed through the nonlinearity. This architecture is defined as follows:

$$y(t) = f\left(\sum_{i=1}^{m} k_i\, y(t-i) + \sum_{i=0}^{n} w_i x_i(t)\right) \qquad (1.5)$$

where $k_i$ are constants. When the feedback is taken from the output of the same
neuron, it is called local feedback, and when the feedback is taken from
other neurons, a globally recurrent network results.
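
The local output feedback idea can be sketched in a few lines. This is only an illustration under stated assumptions: a single neuron, zero initial conditions for the delayed outputs, and a hyperbolic tangent nonlinearity; the function name and the example coefficients are hypothetical.

```python
import math
from collections import deque

def run_local_feedback_neuron(x_seq, w, k, f=math.tanh):
    """Simulate y(t) = f(sum_i k[i-1]*y(t-i) + sum_j w[j]*x_j(t)).

    x_seq is a sequence of input vectors, w the synaptic weights and
    k the feedback constants applied to the delayed outputs.
    """
    past = deque([0.0] * len(k), maxlen=len(k))  # past[0] holds y(t-1)
    outputs = []
    for x in x_seq:
        a = sum(ki * yi for ki, yi in zip(k, past))   # feedback taken after f
        a += sum(wj * xj for wj, xj in zip(w, x))     # feedforward part
        y = f(a)
        past.appendleft(y)                            # oldest value drops out
        outputs.append(y)
    return outputs

# Hypothetical example: constant input, one feedback tap.
out = run_local_feedback_neuron([[1.0, 0.2], [1.0, 0.2]], w=[0.1, 0.5], k=[0.5])
```

Because the delayed value is stored after the nonlinearity, the second output already differs from the first even though the input is unchanged.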

De Vries and Principe [5] proposed the following architecture:

$$\frac{du(t)}{dt} = -a\,u(t) + g_0\, y(t) + \int_0^t g(t-s)\, y(s)\, ds + x(t) \qquad (1.6)$$

where $u(t)$ is the activation state with $y(t) = f(u(t))$, $g_0$ is the gain constant controlling
the dynamics of the system and $g(t)$ is the convolution kernel. In this architecture the
synapse is modelled by the convolution operator. They also proposed a discrete-time
model, where the time delay is modelled by the operator $z^{-1}$ and $z^{-i}x(t)$ is $x(t - i)$.

Poddar-Unnikrishnan [5] proposed an architecture with neurons that have memory,
to store information regarding past activations of the network neurons. The architecture,
as shown in Fig. 1.7, is defined as follows:

$$y(t) = f\left(\sum_{i=0}^{n} w_i x_i(t) + \sum_{j=0}^{m} k_j z_j(t)\right) \qquad (1.7)$$

The output of the memory neuron from the previous layer is defined as follows:

$$z_i(t) = \alpha\, y(t-1) + (1-\alpha)\, z_i(t-1) \qquad (1.8)$$

where $\alpha$ is a constant. The memory neuron remembers the past output values of that
particular neuron. In this case, the memory takes the form of an exponential
filter. The Poddar-Unnikrishnan architecture can be considered a special case of the
generalised Frasconi-Gori-Soda architecture.
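
The exponential-filter memory can be illustrated with a short sketch. It assumes the recursion $z(t) = \alpha\, y(t-1) + (1-\alpha)\, z(t-1)$ with zero initial conditions; the function name and the example sequence are hypothetical.

```python
def memory_trace(y_seq, alpha, z0=0.0):
    """Return z(t) for each t, where z(t) = alpha*y(t-1) + (1-alpha)*z(t-1)."""
    z = z0
    y_prev = 0.0  # assumed value of y before the first sample
    trace = []
    for y in y_seq:
        z = alpha * y_prev + (1.0 - alpha) * z
        trace.append(z)
        y_prev = y
    return trace

# A constant output of 1 is remembered as a slowly saturating trace.
trace = memory_trace([1.0, 1.0, 1.0, 1.0], alpha=0.5)
```

A smaller `alpha` gives a longer memory, since older outputs decay more slowly in the trace.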

The above architectures may be considered generalizations of the classical
McCulloch-Pitts neuron from the input-output point of view. Instead of constant synaptic
weights, the architectures have synaptic weights that are modelled by a linear
transfer function. The feedback from the output is also modelled by a linear transfer
function. These generalized neurons can then be used within a multilayer
perceptron.

There are, however, some major disadvantages associated with this highly nonlinear
structure that restrict the multilayer perceptron in practical applications. The selection
of architectures and parameters for the multilayer perceptron is mainly by experiment, and
its structure is complicated. The convergence rate of the multilayer perceptron is rather
slow, and its learning time is typically very long.
There is an alternative nonlinear technique known as the polynomial perceptron [8]
[9] [10] [11], which is theoretically more tractable than the multilayer
perceptron. In this technique the polynomial approximation method is employed as a means
of approximately realising the optimal solution. It is capable of forming arbitrarily
complex nonlinear decision surfaces to approximate any continuous function within an
arbitrary specified accuracy, which is similar to the multilayer perceptron.

D. F. Specht [8] first proposed the idea of pattern classification using polynomial
functions. Specht referred to his network as the polynomial discriminant method and
showed that it offers a simple way of determining the weight
coefficients for cross-product and power terms in the input variables. The method was based
on the nonparametric estimation of a probability density function for each category to be
classified, and the Bayes decision rule was used for the classification. Because polynomials
can represent functions that are far more complex than any linear function, the polynomial
discriminant function is not restricted to linearly separable problems. Specht applied this
method successfully to the problem of automatic analysis of ECG signals.

D. F. Specht [9] presented a review of two methods, namely the probabilistic neural network
and the polynomial adaline, for classification based on the Bayes strategy and nonparametric
estimators for probability density functions. The performances of these two methods are
compared with that of the multilayer perceptron, and the relative advantages and
disadvantages are also discussed in that paper.

Chen-Gibson-Cowan [4] investigated the application of a simple nonlinear polynomial
perceptron structure and showed that an approximate realisation of the optimal equalisation
solution can be implemented using a polynomial perceptron.

Xiang-Bi-Le-Ngoc [11] proposed the idea of the fractionally spaced recursive polynomial
perceptron and gave the architecture of the bilinear perceptron, which is based on the model
of bilinear systems [13]. Bilinear systems are linear separately with respect to the
state (input) and the control (output), but nonlinear jointly. Since the nonlinearity is due
to the product of input and output, the bilinear perceptron is considered the simplest
architecture in the class of fractionally spaced recursive polynomial perceptrons, where the
input signal vector contains feedback sequences, input sequences and their product terms.

We have examined and investigated the application of the polynomial perceptron
and fractionally spaced recursive polynomial perceptron architectures to problems in
adaptive signal processing and show that these network architectures may be applied as
a solution to noise suppression in ECG signals.


1.4 Back propagation 

Back propagation is a learning law; the term is often used to refer to a hierarchical
network architecture that uses the back propagation update algorithm for adapting the
weights of each layer based on the error present at the network output. The algorithm
is a generalization of the least mean square algorithm. In its simplest form, back
propagation uses a stochastic gradient search technique for the minimization of a cost
function. It requires a continuously differentiable nonlinear function, such as the
sigmoidal or hyperbolic tangent function. When a faster convergence rate is required, a
momentum term is added to make the weight changes smooth.
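
The stochastic gradient update with a momentum term can be sketched as follows. This is a minimal illustration for a single tanh neuron, not the multilayer network studied here; the learning rate `eta`, momentum constant `alpha`, data and function name are all assumptions chosen for the example.

```python
import numpy as np

def train_neuron(X, d, eta=0.05, alpha=0.5, epochs=100, seed=0):
    """Pattern-by-pattern gradient descent with momentum for y = tanh(w . x)."""
    rng = np.random.default_rng(seed)
    w = 0.01 * rng.standard_normal(X.shape[1])   # small random initial weights
    dw = np.zeros_like(w)                        # previous weight change
    history = []
    for _ in range(epochs):
        sq_err = 0.0
        for x, target in zip(X, d):
            y = np.tanh(w @ x)
            e = target - y
            delta = e * (1.0 - y**2)             # error times derivative of tanh
            dw = alpha * dw + eta * delta * x    # momentum smooths the change
            w = w + dw
            sq_err += e**2
        history.append(sq_err)                   # sum of squared errors per epoch
    return w, history

# Synthetic, realizable problem: targets come from a known weight vector.
rng = np.random.default_rng(1)
X = np.hstack([np.ones((50, 1)), rng.standard_normal((50, 2))])
w_true = np.array([0.2, -0.7, 0.5])
d = np.tanh(X @ w_true)
w_hat, history = train_neuron(X, d)
```

Setting `alpha=0` recovers the plain stochastic gradient update, which makes the effect of the momentum term easy to compare.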


1.5 Organisation of Thesis 

The thesis is organised as follows.

In the present chapter 1, we give a brief introduction to medical telemetry and artificial
neural networks, and a review of various architectures and back propagation.

In chapter 2, we present the architecture of the complex neuron and the corresponding
algorithm for the implementation of complex mapping from input to output. This complex
neuron structure is used in the subsequent chapter for the complex-domain mapping of the
input and output of the polynomial perceptron architecture.

In chapter 3, we present the mathematical concept of the polynomial perceptron and the
fractionally spaced polynomial perceptron with their basic properties, features and
limitations. The subsequent section of this chapter covers the types of activation function
and the need for them in the implementation. The last section of this chapter includes the
algorithms which are used for the implementation of the two-dimensional complex-domain
representation of the polynomial perceptron and the fractionally spaced polynomial
perceptron.

In chapter 4, we present the simulation results obtained by the use of the back
propagation algorithm, the complex back propagation algorithm, the polynomial perceptron and
the fractionally spaced bilinear perceptron. We have applied the polynomial perceptron and
the fractionally spaced bilinear perceptron to ECG signals for noise suppression using
16-level quadrature amplitude modulation systems.
Chapter 2


Complex neuron structure 

2.1 Introduction 

In this chapter, we present the concept of the complex neuron architecture and the
corresponding complex back propagation algorithm [17]. Back propagation has been
successfully applied in neural networks for classification and prediction. It is generalized
here to include complex signals and coefficients for many nonlinear signal processing
problems.

We have applied this nonlinear adaptive algorithm to our polynomial perceptron and
fractionally spaced bilinear perceptron models, which require a complex mathematical
representation of the signal.

2.2 Architecture 

The basic element of a multilayer [2] neural network is the neuron. In the multilayer
perceptron, the jth neuron in the mth layer has its primary local connections and
is characterised by a set of real weights and a real threshold. The output of the jth
neuron may then be written as:

$$y_j^{(m)} = f\left(\sum_{i=1}^{N_{m-1}} w_{ji}^{(m)}\, y_i^{(m-1)} + \theta_j^{(m)}\right) \qquad (2.1)$$

where $w_{ji}^{(m)}$, the elements of the weight coefficient vector $W_j^{(m)}$, and $\theta_j^{(m)}$ are the
weights and the threshold of the jth neuron of the mth layer, and $N_{m-1}$ is the number of
nodes in layer $m-1$.
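$f(\cdot)$ is the nonlinear sigmoid-type hyperbolic tangent activation function, which can
be written as

$$f(x) = \tanh(\gamma x / 2) = \frac{1 - e^{-\gamma x}}{1 + e^{-\gamma x}} \qquad (2.2)$$

This neuron architecture can be extended to the complex domain as shown in Fig. 2.1.
As in the multilayer perceptron, the complex output of the jth neuron in the mth
layer can be written as follows:

$$\tilde{y}_j^{(m)} = y_{jR}^{(m)} + j\, y_{jI}^{(m)}$$

$$= f\left(\sum_{i=1}^{N_{m-1}} \left(w_{jiR}^{(m)} + j\, w_{jiI}^{(m)}\right)\left(y_{iR}^{(m-1)} + j\, y_{iI}^{(m-1)}\right) + \left(\theta_{jR}^{(m)} + j\, \theta_{jI}^{(m)}\right)\right)$$

$$= f\left(\left(\sum_{i=1}^{N_{m-1}} \left(w_{jiR}^{(m)} y_{iR}^{(m-1)} - w_{jiI}^{(m)} y_{iI}^{(m-1)}\right) + \theta_{jR}^{(m)}\right) + j\left(\sum_{i=1}^{N_{m-1}} \left(w_{jiR}^{(m)} y_{iI}^{(m-1)} + w_{jiI}^{(m)} y_{iR}^{(m-1)}\right) + \theta_{jI}^{(m)}\right)\right) \qquad (2.3)$$

where the subscripts $R$ and $I$ denote the real and imaginary parts, and $f(x + jy)$ is a
complex mapping function, which can be defined as follows:

$$f(x + jy) = \frac{1 - e^{-\gamma x}}{1 + e^{-\gamma x}} + j\, \frac{1 - e^{-\gamma y}}{1 + e^{-\gamma y}} = f(x) + j f(y) \qquad (2.4)$$

Here $\gamma$ is the slope parameter and the tilde indicates that the variable is complex.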

The structure of this complex neural architecture is similar to that of the multilayer
perceptron. It also consists of several hidden layers and is capable of performing complex
mapping between the input and output layers. The only difference is that the neurons and the
corresponding learning algorithms are in the complex domain.
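
A minimal sketch of the forward pass of one complex layer is given below. It assumes the split activation, in which the hyperbolic tangent function is applied separately to the real and imaginary parts of the weighted sum; the function names, the slope parameter `gamma` and the example values are illustrative assumptions.

```python
import numpy as np

def split_tanh(z, gamma=1.0):
    """Complex activation f(x + jy) = f(x) + j f(y), with f(u) = tanh(gamma*u/2)."""
    z = np.asarray(z, dtype=complex)
    return np.tanh(gamma * z.real / 2) + 1j * np.tanh(gamma * z.imag / 2)

def complex_layer(y_prev, W, theta, gamma=1.0):
    """Outputs of one layer: W holds complex weights, theta complex thresholds."""
    return split_tanh(W @ y_prev + theta, gamma)

# Hypothetical one-input, one-neuron example.
W = np.array([[0.5 + 0.2j]])
y_prev = np.array([1.0 - 1.0j])
theta = np.array([0.1j])
out = complex_layer(y_prev, W, theta)   # weighted sum is 0.7 - 0.2j
```

The complex product `W @ y_prev` carries out the real and imaginary cross terms automatically, so no separate bookkeeping of the four real products is needed.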


2.3 Algorithm 

This algorithm performs an approximation to the global minimization achieved by the
method of the smoothed stochastic gradient. Here, the nonlinear sigmoid-type function
given by equation (2.4) is used as the activation function. The first-order derivative of
this complex function may be defined as follows:

$$f'(x + jy) = f'(x) + j f'(y) = \frac{\gamma}{2}\left(1 - f^2(x)\right) + j\, \frac{\gamma}{2}\left(1 - f^2(y)\right) \qquad (2.5)$$

The error signal ($e_i$) required for adaptation is defined as the difference between the
desired response and the output of the perceptron:

$$e_i(t) = d_i(t) - y_i(t), \qquad i = 1, 2, \ldots, N_m \qquad (2.6)$$

where $d_i$ is the desired response at the ith node of the output layer. Hence, the sum of
error squares produced by the network is as follows:

$$E(t) = \sum_{i=1}^{N_m} e_i(t)\, e_i^{*}(t) \qquad (2.7)$$

We have used the following steps for the implementation of this training algorithm
with the nonlinear sigmoid-type complex activation function.

Step 1. Initially assign small complex random values to all weight coefficients and
node offsets.

Step 2. Present the input vector and desired response for all training patterns.

Step 3. By using equations (2.3) and (2.4), calculate the outputs $y_1, y_2, \ldots, y_{N_m}$ for all
patterns. $N_m$ is the number of nodes in the mth layer.

Step 4. Adapt the weight coefficients as follows:

$$\Delta_{ij}(t+1) = \alpha\, \Delta_{ij}(t) + \eta\, \delta_i\, x_j^{*} \qquad (2.8)$$

$$w_{ij}(t+1) = w_{ij}(t) + \Delta_{ij}(t+1) \qquad (2.9)$$

where $\alpha$ and $\eta$ are the momentum constant and training rate, $x_j^{*}$ is the conjugate
of the output of node j or input to node i, and

$$\delta_i = \begin{cases} (d_i - y_i)\, f' & \text{for output layer} \\ f' \sum_k \delta_k\, w_{ki}^{*} & \text{for input and hidden layers} \end{cases} \qquad (2.10)$$

where the sign * represents the complex conjugate.

Step 5. Repeat by going to step 2.

The generalization of back propagation to deal with complex signals makes it possible
to use this powerful nonlinear signal processing algorithm.
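
The steps above can be sketched for a single complex output neuron. This is an illustration under stated assumptions rather than the exact procedure used in this work: the error term is formed by weighting the real and imaginary parts of the error separately with the derivative of the split activation (one common way of evaluating the complex delta), and the data, learning rate `eta`, momentum `alpha` and function names are hypothetical.

```python
import numpy as np

def split_tanh(z, gamma=1.0):
    """f(x + jy) = tanh(gamma*x/2) + j*tanh(gamma*y/2)."""
    z = np.asarray(z, dtype=complex)
    return np.tanh(gamma * z.real / 2) + 1j * np.tanh(gamma * z.imag / 2)

def train_complex_neuron(X, d, eta=0.1, alpha=0.5, epochs=200, gamma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    # Step 1: small complex random initial weights.
    w = 0.01 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    dw = np.zeros(n, dtype=complex)
    errors = []
    for _ in range(epochs):                      # Step 5: repeat
        total = 0.0
        for x, target in zip(X, d):              # Step 2: present patterns
            y = split_tanh(w @ x, gamma)         # Step 3: forward pass
            e = target - y
            f_re = gamma / 2 * (1 - y.real**2)   # derivative for the real part
            f_im = gamma / 2 * (1 - y.imag**2)   # derivative for the imaginary part
            delta = e.real * f_re + 1j * e.imag * f_im
            dw = alpha * dw + eta * delta * np.conj(x)   # momentum update
            w = w + dw                                   # Step 4: weight update
            total += abs(e) ** 2                 # accumulates e * conj(e)
        errors.append(total)
    return w, errors

# Synthetic, realizable example: targets generated by known complex weights.
rng = np.random.default_rng(1)
X = 0.5 * (rng.standard_normal((40, 3)) + 1j * rng.standard_normal((40, 3)))
w_true = np.array([0.5 - 0.3j, -0.2 + 0.4j, 0.1 + 0.6j])
d = split_tanh(X @ w_true)
w_hat, errors = train_complex_neuron(X, d)
```

For a multilayer network the same update applies layer by layer, with the error term of a hidden node obtained by propagating the deltas of the following layer back through the conjugated weights.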
Chapter 3


Polynomial perceptron 


3.1 Introduction 

Polynomial systems [10] are those nonlinear systems whose output signals can be
related to the input signals through a truncated Volterra series expansion. The Volterra
series expansion can model a large class of nonlinear systems. This is the basic idea
behind the development of the polynomial perceptron, which is able to approximate a large
class of nonlinear signals within a specified accuracy.

In this chapter we discuss the architecture of the polynomial perceptron and the
fractionally spaced recursive polynomial perceptron [11] in detail, with their nonlinear
mapping ability, features and limitations. A polynomial function of sufficient degree can
approximate any continuous function within a specified accuracy. It is shown here that the
nonlinear mapping ability of the polynomial perceptron is similar to that of a multilayer
perceptron with one hidden layer. The fractionally spaced recursive polynomial perceptron
performs better than the polynomial perceptron, with lower complexity and a faster
convergence rate.

The types of activation function and the need for them, together with the training
algorithm used for the implementation of the polynomial perceptron as well as the
fractionally spaced recursive polynomial perceptron, are discussed in sections 4 and 5.
3.2 Architecture of polynomial perceptron


The use of a polynomial function to approximate a continuous function is an old but
effective technique and is widely applied to the identification of nonlinear systems. Any
continuous function can be approximated to within an arbitrary accuracy by a polynomial
function of sufficient size. If the polynomial function is used directly to approximate
a nonlinear function, the degree of the function has to be sufficiently high to achieve
adequate accuracy. Therefore, the complexity of the polynomial perceptron is determined by
two parameters, namely the order (size of the input vector) and the polynomial degree.


3.2.1 Properties 

The architecture of the polynomial perceptron with input order $m = 3$ and polynomial
degree $l = 2$ is shown in Fig. 3.1.

The following polynomial decision function can be used as an approximate realisation of
the decision function:

$$P_W(X(t)) = w_0 + \sum_{i_1=1}^{m} w_{i_1}\, x(t - i_1 + 1) + \sum_{i_1=1}^{m} \sum_{i_2=i_1}^{m} w_{i_1 i_2}\, x(t - i_1 + 1)\, x(t - i_2 + 1)$$

$$+ \cdots + \sum_{i_1=1}^{m} \sum_{i_2=i_1}^{m} \cdots \sum_{i_l=i_{l-1}}^{m} w_{i_1 i_2 \cdots i_l}\, x(t - i_1 + 1)\, x(t - i_2 + 1) \cdots x(t - i_l + 1) \qquad (3.1)$$

where

$$X(t) = [x(t), x(t-1), \ldots, x(t-m+1)]^T \qquad (3.2)$$

is the $m$-dimensional input signal vector, and

$$W = [w_0, w_1, w_2, \ldots, w_m, w_{11}, w_{12}, \ldots, w_{1m}, \ldots, w_{mm \cdots m}]^T \qquad (3.3)$$

is the $n_W$-dimensional coefficient vector. The dimension $n_W$ may be obtained from the
following equation:

$$n_W = \sum_{i=0}^{l} n_i, \qquad n_0 = 1, \qquad n_i = n_{i-1}(m + i - 1)/i, \quad i = 1, 2, \ldots, l \qquad (3.4)$$
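
The size of the coefficient vector and the product terms themselves can be sketched as follows. The recursion assumes that $n_i$ counts the distinct degree-$i$ products of $m$ inputs, starting from $n_0 = 1$ for the constant term; the function names and the example input are hypothetical.

```python
from itertools import combinations_with_replacement
from math import comb, prod

def coefficient_count(m, l):
    """Number of weights for input order m and polynomial degree l."""
    n_i, total = 1, 1                       # n_0 = 1 accounts for w_0
    for i in range(1, l + 1):
        n_i = n_i * (m + i - 1) // i        # n_i = n_{i-1} (m + i - 1) / i
        total += n_i
    return total

def polynomial_terms(x, l):
    """Constant, linear and higher product terms of x, in nondecreasing index order."""
    terms = [1.0]
    for degree in range(1, l + 1):
        for idx in combinations_with_replacement(range(len(x)), degree):
            terms.append(prod(x[i] for i in idx))
    return terms

n_w = coefficient_count(3, 2)               # m = 3, l = 2, as in Fig. 3.1
terms = polynomial_terms([1.0, 2.0, 3.0], 2)
```

The polynomial output is then the inner product of the weight vector with `terms`; the count grows quickly with the degree, for example from 10 weights at $l = 2$ to 35 at $l = 4$ for $m = 3$.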
Fig. 3.1: Architecture of polynomial perceptron

The mapping ability of the polynomial perceptron can be shown by using the
Stone-Weierstrass theorem [12], which establishes sufficient conditions under which a
polynomial function can approximate any continuous real-valued function.
Equation (3.1) can be written as

$$P_W(X(t)) = \sum_{j=0}^{l} p_W^{j}(X(t)) \qquad (3.5)$$

where

$$p_W^{0} = w_0$$

$$p_W^{1}(X(t)) = \sum_{i_1=1}^{m} w_{i_1}\, x(t - i_1 + 1)$$

$$p_W^{l}(X(t)) = \sum_{i_1=1}^{m} \sum_{i_2=i_1}^{m} \cdots \sum_{i_l=i_{l-1}}^{m} w_{i_1 i_2 \cdots i_l}\, x(t - i_1 + 1)\, x(t - i_2 + 1) \cdots x(t - i_l + 1)$$

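Equation (3.5) satisfies the following conditions of the Stone-Weierstrass theorem:

• Identity function: It is obvious that the set of all polynomial functions
$\{P_W^{l}(X(t))\}$, $l = 0, 1, \ldots$, contains the constant functions $P_W(X(t)) = w_0$.

• Separability: Take two distinct points $X(t)$ and $X'(t)$; we may assume that
$x(t) \neq x'(t)$. Then a degree-1 polynomial function $p_W^{1}(X(t))$ with the set of
coefficients $w_0 = 0$, $w_1 \neq 0$, $w_i = 0$, $i = 2, 3, \ldots, m$ satisfies
$p_W^{1}(X(t)) \neq p_W^{1}(X'(t))$.

• Algebraic closure: The sum and product of two polynomial functions $P_{W_1}^{l_1}$ and
$P_{W_2}^{l_2}$ are also polynomial functions. The degree of the polynomial for the sum is
$\max\{l_1, l_2\}$ and the degree of the polynomial for the product is $(l_1 + l_2)$. For
example, if $l_2 > l_1$ then

$$\left(P_{W_1}^{l_1} + P_{W_2}^{l_2}\right)(X(t)) = \sum_{i=0}^{l_1} \left(p_{W_1}^{i} + p_{W_2}^{i}\right)(X(t)) + \sum_{i=l_1+1}^{l_2} p_{W_2}^{i}(X(t)) \qquad (3.6)$$

$$\left(P_{W_1}^{l_1} \times P_{W_2}^{l_2}\right)(X(t)) = \sum_{i=0}^{l_1+l_2} p_{W}^{i}(X(t)) \qquad (3.7)$$

where

$$\left(p_{W_1}^{0} + p_{W_2}^{0}\right)(X(t)) = w_{1,0} + w_{2,0}$$

$$p_{W}^{0}(X(t)) = w_{1,0}\, w_{2,0}$$

$$\left(p_{W_1}^{1} + p_{W_2}^{1}\right)(X(t)) = \sum_{i_1=1}^{m} \left(w_{1,i_1} + w_{2,i_1}\right) x(t - i_1 + 1)$$

$$p_{W}^{1}(X(t)) = \sum_{i_1=1}^{m} \left(w_{1,0}\, w_{2,i_1} + w_{1,i_1}\, w_{2,0}\right) x(t - i_1 + 1)$$

$$\left(p_{W_1}^{l_1} + p_{W_2}^{l_1}\right)(X(t)) = \sum_{i_1=1}^{m} \sum_{i_2=i_1}^{m} \cdots \sum_{i_{l_1}=i_{l_1-1}}^{m} \left(w_{1,i_1 i_2 \cdots i_{l_1}} + w_{2,i_1 i_2 \cdots i_{l_1}}\right) x(t - i_1 + 1)\, x(t - i_2 + 1) \cdots x(t - i_{l_1} + 1)$$

$$p_{W}^{l_1+l_2}(X(t)) = \sum_{i_1=1}^{m} \sum_{i_2=i_1}^{m} \cdots \sum_{i_{l_1+l_2}=i_{l_1+l_2-1}}^{m} \left(w_{1,i_1 i_2 \cdots i_{l_1}}\, w_{2,i_{l_1+1} \cdots i_{l_1+l_2}}\right) x(t - i_1 + 1)\, x(t - i_2 + 1) \cdots x(t - i_{l_1+l_2} + 1)$$
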


It is clear from the above properties that the set of all polynomial functions is dense in
the real Banach space of continuous functions on X. In other words, the polynomial function
satisfies the Stone-Weierstrass theorem. Therefore, a polynomial function can approximate
any continuous function within a specified accuracy.

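We can expand the sigmoidal function $f(x)$ by the Taylor series expansion as
follows:

$$f(x) = \frac{1}{1 + e^{-x}} = k_0 + \sum_{i=0}^{\infty} k_{2i+1}\, x^{2i+1} \qquad (3.8)$$

where

$$k_0 = \frac{1}{2}, \qquad k_i = \frac{1}{i!} \left.\frac{d^i f(x)}{dx^i}\right|_{x=0} \qquad (3.9)$$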
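
As a quick numerical check of this expansion, the sketch below assumes the logistic sigmoid $f(x) = 1/(1 + e^{-x})$, whose first Taylor coefficients are $k_0 = 1/2$, $k_1 = 1/4$, $k_3 = -1/48$ and $k_5 = 1/480$ (all even-order coefficients beyond $k_0$ vanish). The function names are hypothetical.

```python
import math

def sigmoid(x):
    """Logistic sigmoid 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_taylor(x):
    """Truncated expansion k_0 + k_1 x + k_3 x^3 + k_5 x^5 about x = 0."""
    return 0.5 + x / 4 - x**3 / 48 + x**5 / 480

# The truncation is accurate near the origin and degrades as |x| grows.
max_gap = max(abs(sigmoid(x) - sigmoid_taylor(x)) for x in (-0.5, 0.1, 0.5))
```

Only odd powers appear besides the constant, which is why substituting a polynomial for $x$ again yields a series in products of the inputs.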
Therefore, the output $y(t)$ can be written as follows:

$$y(t) = f(P_W(X(t))) = k_0 + \sum_{i=0}^{\infty} k_{2i+1} \left(P_W(X(t))\right)^{2i+1}$$

$$= \sum_{i_1=0}^{\infty} \sum_{i_2=0}^{\infty} \cdots \sum_{i_m=0}^{\infty} K_{i_1 i_2 \cdots i_m}\, x^{i_1}(t)\, x^{i_2}(t-1) \cdots x^{i_m}(t-m+1) \qquad (3.10)$$

where $K$ denotes the coefficients. Equation (3.10) is nothing but the well-known
Volterra series. Now we can show that the mapping ability of the polynomial perceptron is
similar to that of the multilayer perceptron with one hidden layer.

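Consider a three-layer neural network with one hidden layer and two input signals
$x_1$ and $x_2$, where $\mu_{ij}$ is the weight from the ith input signal to the jth neuron of the
input layer, $\nu_{jk}$ is the weight from the jth neuron of the input layer to the kth neuron of
the hidden layer and $e_k$ is the weight from the kth neuron of the hidden layer to the neuron
of the output layer.

The output of the jth neuron of the input layer, the output of the kth neuron of the
hidden layer and the overall output of the neural network may be written as follows:

$$O_j^{(1)} = f\left(\sum_{i=1}^{2} \mu_{ij}\, x_i\right), \qquad j = 1, 2 \qquad (3.11)$$

$$O_k^{(2)} = f\left(\sum_{j=1}^{2} \nu_{jk}\, O_j^{(1)}\right), \qquad k = 1, 2 \qquad (3.12)$$

$$O = \sum_{k=1}^{2} e_k\, O_k^{(2)} = \sum_{k=1}^{2} e_k\, f\left(\sum_{j=1}^{2} \nu_{jk}\, O_j^{(1)}\right) \qquad (3.13)$$

Therefore, by using equation (3.8), the above output equations may be rewritten as
follows:

$$O_1^{(1)} = k_0 + \sum_{i=0}^{\infty} k_{2i+1} \left(\mu_{11} x_1 + \mu_{21} x_2\right)^{2i+1} \qquad (3.14)$$

$$O_2^{(1)} = k_0 + \sum_{i=0}^{\infty} k_{2i+1} \left(\mu_{12} x_1 + \mu_{22} x_2\right)^{2i+1} \qquad (3.15)$$

$$O_1^{(2)} = k_0 + \sum_{i=0}^{\infty} k_{2i+1} \left(\nu_{11} O_1^{(1)} + \nu_{21} O_2^{(1)}\right)^{2i+1} \qquad (3.16)$$

$$O_2^{(2)} = k_0 + \sum_{i=0}^{\infty} k_{2i+1} \left(\nu_{12} O_1^{(1)} + \nu_{22} O_2^{(1)}\right)^{2i+1} \qquad (3.17)$$

$$O = e_1 O_1^{(2)} + e_2 O_2^{(2)} \qquad (3.18)$$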
Equation (3.18) may be written as follows:

$$O = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} K_{ij}\, x_1^{i}\, x_2^{j} \qquad (3.19)$$

where $K$ is the coefficient. Equation (3.19) is also a Volterra series.

Hence, it is clear that the nonlinear mapping ability of the polynomial perceptron is
identical to the mapping ability of a three-layer perceptron with one hidden layer.

It is seen that the number of weight coefficients increases exponentially in the
implementation of the polynomial perceptron. Therefore, the polynomial perceptron is
computationally more complex than a simple linear structure. Increased computational
complexity and dimensionality is a common price for employing a nonlinear architecture.
However, the structure and operation of the polynomial perceptron are simpler than those of
the multilayer perceptron.

3.2.2 Features 

The polynomial perceptron possesses the following important features:

• The polynomial perceptron provides a simple method of determining weights for
cross-product terms and power terms.

• The algorithm adjusts the weight coefficients of the polynomial on the basis of one
pattern at a time.

• Since the weight coefficients are adjusted after each pattern, the polynomial perceptron
is able to use new information as it becomes available.

• The learning procedure is not iterative; learning is complete after each pattern has
been observed only once. This feature makes storage of training patterns unnecessary.

• With a minor modification to the basic adaptation algorithm, the polynomial perceptron
can disregard old data and therefore follow nonstationary statistics.

• The shape of the decision surfaces can be made as complex as necessary, or as simple
as desired.

• The computational and storage requirements of the polynomial perceptron increase only
linearly with the number of weight coefficients used.

• The number of weight coefficients used can approach or even exceed the number of
training patterns with no danger of the polynomial overfitting the data.

3.2.3 Limitations 

• The major limitation of the polynomial perceptron is that the number of weight
coefficients increases exponentially with the polynomial degree ($l$).

• Another limitation of the polynomial perceptron is that any noise in the input signal
to the adaptive filter appears in a multiplicative fashion and affects the performance.
Because of the cross-product terms and higher power terms, the noise in the input data
is amplified. This affects the process of convergence, and the network will not be able
to learn the input patterns if the noise level is very high.

3.3 Architecture of fractionally spaced recursive polynomial
perceptron

The feedback concept is very useful in the modeling of many nonlinear systems. Bilinear
systems are one class that utilizes the feedback concept: the output sequences are also
used as a part of the input signal vectors. The simplest architecture of the fractionally
spaced recursive polynomial perceptron is basically the bilinear architecture. This
architecture overcomes the drawbacks of the polynomial perceptron. It requires a small
number of weight coefficients compared to the polynomial perceptron and is equally
capable of approximating any continuous function within the specified accuracy.
3.3.1 Properties

The input-output relation for this architecture is given by the following equation:

$$ y(t) = f\left(P_W^l(Z(t))\right) \tag{3.20} $$

where $P_W^l$ is a degree-$l$ polynomial function with weight coefficient vector $W$, and

$$ Z(t) = [\,y(t - T_0),\ y(t - 2T_0),\ \ldots,\ y(t - (n-1)T_0),\ x(t),\ x(t - \tau_0),\ \ldots,\ x(t - (m-1)\tau_0)\,]^T \tag{3.21} $$

is the $(m + n - 1)$-dimensional input signal vector with $\tau_0 < T_0$. The architecture
consists of asynchronous feedforward filters and synchronous feedback filters. The
variables $\tau_0$ and $T_0$ are the tap spacings, and $m$ and $n$ are the tap numbers of
the feedforward part and the feedback part, respectively. Since the architecture of this
fractionally spaced recursive polynomial perceptron is similar to the architecture of
the polynomial perceptron, it can be shown that this architecture is also capable of
approximating any continuous function within an arbitrary accuracy, like the polynomial
perceptron of the previous section.

In spite of its simplicity, it is an important nonlinear architecture that can model a
large class of nonlinear systems with a small number of weight coefficients. If the input
signal vector space is compact, then the set of all fractionally spaced recursive
polynomial perceptrons $\{f(P_W^l(Z(t)))\}$ is dense in the real Banach space of
continuous functions on $Z$.

3.3.2 Fractionally spaced bilinear perceptron 

Systems that are linear with respect to the input and the output separately, but
nonlinear due to the products between input and output, are known as bilinear systems [13].
The architecture of the bilinear system is shown in Fig. 3.2.

Fig. 3.2: Architecture of bilinear system.

Fig. 3.3: Architecture of fractionally spaced bilinear perceptron.

There are several theoretical and practical motivations for the importance attached
to the class of bilinear systems. This class allows the application of a number of
techniques and analytical procedures already set up for linear systems. The nonlinear
structure of bilinear systems offers some important advantages over the linear case.
Since the nonlinearity in the bilinear system is due to the product term, the
architecture of bilinear systems may be classified in the category of fractionally spaced
recursive polynomial perceptrons as the fractionally spaced bilinear perceptron (FSBLP).

The fractionally spaced bilinear perceptron is the simplified architecture of the
fractionally spaced polynomial perceptron class. The weight coefficient vector for the
fractionally spaced bilinear perceptron may be written as follows:

$$ W = [\,a_0,\ a_1,\ \ldots,\ a_{m-1},\ c_1,\ \ldots,\ c_{n-1},\ b_{01},\ b_{02},\ \ldots,\ b_{(m-1)(n-1)}\,] \tag{3.22} $$

Therefore, the first and second degree polynomial functions may be written as follows:

$$ P_W^1(Z(t)) = \sum_{i=0}^{m-1} a_i\, x(t-i) + \sum_{j=1}^{n-1} c_j\, y(t-j) $$

$$ P_W^2(Z(t)) = \sum_{i=0}^{m-1} \sum_{j=1}^{n-1} b_{ij}\, x(t-i)\, y(t-j) $$

Then, the simplified output equation may be written as follows:

$$ y(t) = f\left( \sum_{i=0}^{m-1} a_i\, x(t-i) + \sum_{j=1}^{n-1} c_j\, y(t-j) + \sum_{i=0}^{m-1} \sum_{j=1}^{n-1} b_{ij}\, x(t-i)\, y(t-j) \right) \tag{3.23} $$

where $f(\cdot)$ is the nonlinear sigmoid-type activation function. The Taylor series
expansion of the sigmoid function is given by equation (3.8).

The architecture of the fractionally spaced bilinear perceptron with feedforward filter
taps ($m = 3$) and feedback filter taps ($n = 3$) is shown in Fig. 3.3. The input to the
feedforward filter is the received sequence and the input to the feedback filter is the
recovered sequence.

3.3.3 Features 

• This architecture possesses almost all the features of the polynomial perceptron listed
above.

• It requires a smaller number of weight coefficients, which reduces the computational
complexity and storage requirements.

3.3.4 Limitations 

• The obvious limitation of this representation is that it must be continuously monitored
for stability.

• Another limitation of the fractionally spaced recursive polynomial perceptron is that
any noise in the input signal to the adaptive filter appears in a multiplicative fashion
and affects the performance. Because of the cross-product terms, the noise in the input
data is amplified. This affects the process of convergence, and the network will not be
able to learn the input patterns if the noise level is very high.

3.4 Activation Function 

The activation function used for the implementation of the polynomial perceptron and the
fractionally spaced recursive polynomial perceptron is the nonlinear function of the
sigmoid type (hyperbolic tangent function), as given by the following equation:

$$ f(x) = \tanh(\gamma x / 2) = \frac{1 - e^{-\gamma x}}{1 + e^{-\gamma x}} = \frac{2}{1 + e^{-\gamma x}} - 1, \qquad \gamma > 0 \tag{3.24} $$

This particular choice of the sigmoid function is made because of the bipolar nature of
the transmitted signal $x(t)$. Here $\gamma$ is the slope parameter. The value of $\gamma$
is normally selected as unity, but when the polynomial function is directly used to
approximate the nonlinear function, the value of $\gamma$ is kept less than 1.

Although the sigmoid function does complicate the mean square error (MSE) surface, the
classifier will converge to the correct solution when the initial parameters of the
training algorithm are inside a certain sphere.
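
As a quick numerical illustration of equation (3.24), the sketch below evaluates the three equivalent forms of the activation function and its slope. The function names are hypothetical and chosen only for this example. The slope expression $(\gamma/2)(1 - f^2(x))$ is the standard derivative of $\tanh(\gamma x/2)$; it is presumably why a factor $(1 - y^2(t))$ appears in the weight updates of Section 3.5, with the constant $\gamma/2$ absorbed into the learning rate (an interpretation, not a statement from the text).

```python
import math


def sigmoid_tanh(x, gamma=1.0):
    """First form of (3.24): tanh(gamma * x / 2)."""
    return math.tanh(gamma * x / 2.0)


def sigmoid_ratio(x, gamma=1.0):
    """Second form of (3.24): (1 - exp(-gamma x)) / (1 + exp(-gamma x))."""
    e = math.exp(-gamma * x)
    return (1.0 - e) / (1.0 + e)


def sigmoid_shifted(x, gamma=1.0):
    """Third form of (3.24): 2 / (1 + exp(-gamma x)) - 1."""
    return 2.0 / (1.0 + math.exp(-gamma * x)) - 1.0


def sigmoid_slope(x, gamma=1.0):
    """Derivative of tanh(gamma x / 2): (gamma / 2) * (1 - f(x)^2)."""
    f = sigmoid_tanh(x, gamma)
    return 0.5 * gamma * (1.0 - f * f)
```

A smaller slope parameter flattens the function around the origin, so the output saturates at -1 or +1 only for larger inputs.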

The real criterion is the classification accuracy; the mean square error criterion is
used only for training, to obtain an acceptable level of misclassification. An
alternative way to achieve this is to increase the polynomial degree sufficiently, which
increases the size of the weight coefficient vector exponentially and increases the
complexity. By introducing the nonlinear sigmoid-type activation function (tanh
function), the polynomial degree can instead be restricted to between 3 and 5, which is
sufficient to approximate the optimal solution.

Since we have considered the two-dimensional classification problem, the neurons of the
network, the input sequences and the output sequences are replaced by their equivalent
complex-domain representations. The sigmoid function is replaced by the complex mapping
sigmoid function.

By using equation (2.4), the real and imaginary parts may be defined as follows:

$$ f(u) = \frac{1 - e^{-\gamma u}}{1 + e^{-\gamma u}} = \frac{2}{1 + e^{-\gamma u}} - 1.0 \tag{3.25} $$

$$ f(v) = \frac{1 - e^{-\gamma v}}{1 + e^{-\gamma v}} = \frac{2}{1 + e^{-\gamma v}} - 1.0 \tag{3.26} $$

Equations (3.25) and (3.26) are used to obtain the output sequences for the polynomial
perceptron as well as for the fractionally spaced polynomial perceptron.
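
A minimal sketch of such a complex activation is given below. It assumes that $u$ and $v$ denote the real and imaginary parts of the complex node input and that the sigmoid is applied to each part independently, which is what equations (3.25) and (3.26) suggest; equation (2.4) itself is not reproduced in this chapter, so this reading is an assumption. The function names are hypothetical.

```python
import math


def bipolar_sigmoid(x, gamma=1.0):
    """Equal to 2 / (1 + exp(-gamma x)) - 1; tanh avoids overflow for large |x|."""
    return math.tanh(gamma * x / 2.0)


def complex_sigmoid(z, gamma=1.0):
    """Apply the bipolar sigmoid separately to the real and imaginary parts of z."""
    return complex(bipolar_sigmoid(z.real, gamma), bipolar_sigmoid(z.imag, gamma))
```

With this choice both output components stay inside the open interval (-1, +1), matching the bipolar range of the real-valued activation.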


3.5 Training Algorithms 

A number of algorithms are available to train neural networks. Here, we have used the
smoothed stochastic gradient algorithm to obtain the most suitable weight coefficients
for the polynomial perceptron and fractionally spaced bilinear perceptron networks.

3.5.1 Algorithm for polynomial perceptron 

Step 1. Initially assign small complex random values to all weight coefficients.

Step 2. Present the input vector and the desired training response.

Step 3. Calculate the actual complex output sequence $y(t)$ by using equations (3.25)
and (3.26) as given below:

$$ y(t) = f\left(P_W^l(X(t))\right) \tag{3.27} $$

Step 4. Update the weight coefficients by using the smoothed stochastic gradient
algorithm as follows:

$$ \Delta_i(t+1) = \alpha\, \Delta_i(t) + \eta\, e(t)\, (1 - y^2(t))\, M_i(t) \tag{3.28} $$

$$ w_i(t+1) = w_i(t) + \Delta_i(t+1) \tag{3.29} $$

$$ i = 1, 2, \ldots, n_k \tag{3.30} $$

where the constant $n_k$ is given by equation (3.4). The constants $\eta$ and $\alpha$ are
used as the adaptive learning rate and the momentum constant, respectively; $w_i$ are the
corresponding complex weight coefficients and $\Delta_i$ are the complex difference
elements used to modify the weight coefficients. $M_i$ are the monomials of
$x(t), x(t-1), \ldots, x(t-m+1)$ from degree 0 up to degree $l$.

$$ e(t) = d(t) - y(t) \tag{3.31} $$

$e(t)$ is the difference between the desired training sample and the recovered output;
$d(t)$ is the desired sample.

Step 5. Repeat by going to Step 2.
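
The five steps above can be sketched for the real-valued (one-dimensional) case as follows. This is an illustration under stated assumptions rather than the original implementation: real weights replace the complex ones, the monomials are enumerated with `itertools.combinations_with_replacement`, and the function names `monomials` and `pp_train_step` are hypothetical.

```python
import itertools
import math


def monomials(x, degree):
    """All monomials of the entries of x from degree 0 up to `degree`."""
    terms = []
    for d in range(degree + 1):
        for idx in itertools.combinations_with_replacement(range(len(x)), d):
            terms.append(math.prod(x[i] for i in idx))  # empty product is 1
    return terms


def pp_train_step(w, delta, x, d, degree, eta, alpha, gamma=1.0):
    """One pass through Steps 3 and 4 for a single training pattern (x, d)."""
    m = monomials(x, degree)
    s = sum(wi * mi for wi, mi in zip(w, m))
    y = math.tanh(gamma * s / 2.0)           # output through the tanh activation
    e = d - y                                # error: desired minus recovered
    # Smoothed (momentum) gradient step on every weight.
    delta = [alpha * di + eta * e * (1.0 - y * y) * mi
             for di, mi in zip(delta, m)]
    w = [wi + di for wi, di in zip(w, delta)]
    return w, delta, y, e
```

Calling `pp_train_step` once per incoming pattern, and carrying `w` and `delta` forward, corresponds to the loop from Step 2 to Step 5.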


3.5.2 Algorithm for fractionally spaced bilinear perceptron 


Step 1. Initially assign small complex random values to all weight coefficients.

Step 2. Present the input vector and the desired training response.

Step 3. Calculate the actual complex output sequence $y(t)$ by using equations (3.25)
and (3.26) as given below:

$$ y(t) = f\left( \sum_{i=0}^{m-1} a_i\, x(t-i) + \sum_{j=1}^{n-1} c_j\, y(t-j) + \sum_{i=0}^{m-1} \sum_{j=1}^{n-1} b_{ij}\, x(t-i)\, y(t-j) \right) \tag{3.32} $$

Step 4. Update the weight coefficients by using the smoothed stochastic gradient
algorithm as follows:

$$ \Delta_{a_i}(t+1) = \alpha\, \Delta_{a_i}(t) + \eta_a\, e(t)\, (1 - y^2(t))\, x(t-i) \tag{3.33} $$

$$ \Delta_{c_j}(t+1) = \alpha\, \Delta_{c_j}(t) + \eta_c\, e(t)\, (1 - y^2(t))\, y(t-j) \tag{3.34} $$

$$ \Delta_{b_{ij}}(t+1) = \alpha\, \Delta_{b_{ij}}(t) + \eta_b\, e(t)\, (1 - y^2(t))\, x(t-i)\, y(t-j) \tag{3.35} $$

$$ a_i(t+1) = a_i(t) + \Delta_{a_i}(t+1) \tag{3.36} $$

$$ c_j(t+1) = c_j(t) + \Delta_{c_j}(t+1) \tag{3.37} $$

$$ b_{ij}(t+1) = b_{ij}(t) + \Delta_{b_{ij}}(t+1) \tag{3.38} $$

$$ i = 0, 1, 2, \ldots, (m-1) $$

$$ j = 1, 2, 3, \ldots, (n-1) $$

where the terms $\eta_{(\cdot)}$ and $\alpha$ are used as the adaptive learning rates and the
momentum constant, respectively; $a_i$, $c_j$ and $b_{ij}$ are the corresponding complex
weight coefficients and $\Delta_{(\cdot)}$ are the complex difference elements used to modify
the weight coefficients.

$$ e(t) = d(t) - y(t) \tag{3.39} $$

$e(t)$ is the difference between the desired training sample and the recovered output;
$d(t)$ is the desired sample.

Step 5. Repeat by going to Step 2.

The use of the smoothed stochastic gradient algorithm improves the performance, but it
requires more computation in each recursion. It can therefore sometimes be replaced by
the plain stochastic gradient algorithm.

The above algorithms are derived for a two-dimensional input signal vector, so the nodes
are considered as complex neurons (as given in chapter 2). For a one-dimensional input
vector, the same algorithms may be used with the complex nodes replaced by real nodes.
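
The real-node variant of the fractionally spaced bilinear perceptron described in Section 3.5.2 can be sketched as below. Several points are assumptions made only for this illustration: the class name `RealFSBLP` is hypothetical, the taps are spaced by whole samples (the fractional spacing $\tau_0 < T_0$ is not modeled), and the initial weights are drawn uniformly from a small interval. The caller supplies `xs = [x(t), x(t-1), ..., x(t-m+1)]` and `ys = [y(t-1), ..., y(t-n+1)]`.

```python
import math
import random


class RealFSBLP:
    """Real-valued bilinear perceptron with momentum-smoothed gradient updates."""

    def __init__(self, m, n, eta_a, eta_b, eta_c, alpha, gamma=1.0, seed=0):
        rng = random.Random(seed)
        self.m, self.n = m, n
        self.eta_a, self.eta_b, self.eta_c = eta_a, eta_b, eta_c
        self.alpha, self.gamma = alpha, gamma
        # Step 1: small random initial weights (scale 0.01 is an assumption).
        self.a = [0.01 * rng.uniform(-1, 1) for _ in range(m)]
        self.c = [0.01 * rng.uniform(-1, 1) for _ in range(n - 1)]
        self.b = [[0.01 * rng.uniform(-1, 1) for _ in range(n - 1)]
                  for _ in range(m)]
        self.da = [0.0] * m
        self.dc = [0.0] * (n - 1)
        self.db = [[0.0] * (n - 1) for _ in range(m)]

    def num_weights(self):
        return self.m + (self.n - 1) + self.m * (self.n - 1)

    def output(self, xs, ys):
        """Step 3: linear feedforward, linear feedback and cross-product terms."""
        s = sum(ai * xi for ai, xi in zip(self.a, xs))
        s += sum(cj * yj for cj, yj in zip(self.c, ys))
        s += sum(self.b[i][j] * xs[i] * ys[j]
                 for i in range(self.m) for j in range(self.n - 1))
        return math.tanh(self.gamma * s / 2.0)

    def train_step(self, xs, ys, d):
        """Step 4: update a_i, c_j and b_ij from the error e = d - y."""
        y = self.output(xs, ys)
        e = d - y
        g = e * (1.0 - y * y)
        for i in range(self.m):
            self.da[i] = self.alpha * self.da[i] + self.eta_a * g * xs[i]
            self.a[i] += self.da[i]
            for j in range(self.n - 1):
                self.db[i][j] = (self.alpha * self.db[i][j]
                                 + self.eta_b * g * xs[i] * ys[j])
                self.b[i][j] += self.db[i][j]
        for j in range(self.n - 1):
            self.dc[j] = self.alpha * self.dc[j] + self.eta_c * g * ys[j]
            self.c[j] += self.dc[j]
        return y, e
```

Setting `alpha=0.0` reduces the update to the plain stochastic gradient algorithm mentioned above.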
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Chapter 4 


Simulation results 


All of the simulations were performed on HP-850, Convex C-220 (akash) and DEC Alpha
machines.


4.1 Back propagation algorithm 

In this section we present the results obtained from the implementation of the back
propagation algorithm.

The ability of the back propagation algorithm is investigated using the classification
problem of 16 sine waves of different frequencies.

Since the network is used as a classifier, all desired outputs are set to zero except
the one corresponding to the class of the input, which is set to one. The following
structure of the multilayer perceptron is selected for this exercise:

- Number of nodes in input layer = 128
- Number of nodes in output layer = 16
- Number of hidden layers = 2
- Number of nodes in first hidden layer = 64
- Number of nodes in second hidden layer = 32

The training (or learning) rate ($\eta$) and momentum rate ($\alpha$) are varied in steps
of 0.1 for 16 input samples of different sine waves to obtain the number of iterations
required for restricting the global mean square error to less than 0.03. The number of
iterations required for different combinations of $\eta$ and $\alpha$ is illustrated in
Fig. 4.1. It is noted that without the momentum rate the convergence is very slow, and
the network is not trained even in 300 iterations for any value of $\eta$. We have
observed the following salient features:

• The momentum rate plays an important role in the speed of convergence. The
convergence is faster if a small value of momentum rate is added.

• The number of iterations decreases as the value of $\eta$ increases.

• The convergence is more uniform with momentum.

• The number of iterations required for convergence decreases with increasing $\alpha$
up to a particular limit; beyond that, the number of iterations required to restrict the
global error to the specified limit increases. It is noted that if $\alpha$ and $\eta$
are both greater than 0.6, the convergence is slow.

In the next step, the network is trained for these 16 sine waves of different
frequencies, with $\eta = 0.6$ and $\alpha = 0.3$. Fig. 4.2 illustrates the reduction in
mean square error for the first 50 iterations during the training of the network. The
mean square error reduces smoothly and exponentially with the selected set of $\eta$ and
$\alpha$, and falls below 0.0004 within 50 iterations.

Then, we shifted these sine waves, used them as the input for this trained network and
found the corresponding outputs. We have observed that this network is very powerful in
identifying these shifted sine waves of different frequencies. This particular network
correctly identifies these patterns up to a shift of 48 degrees in the sine waves;
beyond a 48 degree shift it fails to identify these patterns.

4.2 Complex back propagation algorithm

In this section we present the results obtained with the successful implementation of
the complex back propagation algorithm as given in chapter 2. In order to test the
performance of this algorithm, the following four complex input patterns are used for
classification:

- First input pattern: [(1, 1), (0, 0), (0, 0), (0, 0)]
- Second input pattern: [(0, 0), (1, 1), (0, 0), (0, 0)]
- Third input pattern: [(0, 0), (0, 0), (1, 1), (0, 0)]
- Fourth input pattern: [(0, 0), (0, 0), (0, 0), (1, 1)]

The network is trained with these patterns as input for two different configurations of
the multilayer perceptron:

• In the first case we have considered the {4, 2, 4} structure, i.e., 4 input nodes,
4 output nodes and one hidden layer with 2 nodes.

• In the second case we have considered the {4, 8, 4} structure, i.e., 4 input nodes,
4 output nodes and one hidden layer with 8 nodes.

In both cases the learning rate ($\eta$) and momentum rate ($\alpha$) are set to 0.6 and
0.1, respectively.

The training results are obtained after 100 iterations, 1000 iterations, 2000 iterations
and 4000 iterations for both structures, {4, 2, 4} and {4, 8, 4}, and are given in
Table 4.1 and Table 4.2.

It is observed that the complex neuron architecture given in chapter 2 is indeed capable
of recognizing complex signals. It is to be noted that the selection of the number of
hidden layers and the number of nodes in each hidden layer, as well as of the learning
parameters, is purely experimental.









Target   100 iterations     1000 iterations    2000 iterations    4000 iterations

(1, 1)   (0.7589, 0.7353)   (0.9?9?, 0.9873)   (0.9728, 0.9909)   (0.9813, 0.9934)
(0, 0)   (0.0047, 0.0029)   (0.0188, 0.0000)   (0.0134, 0.0000)   (0.0082, 0.0000)
(0, 0)   (0.1173, 0.0627)   (0.0491, 0.0266)   (0.0335, 0.0186)   (0.0233, 0.0130)
(0, 0)   (0.0757, 0.8086)   (0.0149, 0.0170)   (0.0118, 0.0103)   (0.0086, 0.00?4)

(0, 0)   (0.0017, 0.0011)   (0.0000, 0.0004)   (0.0000, 0.0002)   (0.0000, 0.0002)
(1, 1)   (0.8687, 0.9832)   (0.9704, 0.9846)   (0.9779, 0.9893)   (0.9846, 0.9928)
(0, 0)   (0.0917, 0.0008)   (0.0420, 0.0033)   (0.0292, 0.0027)   (0.0202, 0.0020)
(0, 0)   (0.1774, 0.1317)   (0.028?, 0.0135)   (0.0176, 0.0088)   (0.0121, 0.0062)

(0, 0)   (0.1092, 0.2147)   (0.0320, 0.010?)   (0.0219, 0.0075)   (0.01?1, 0.00?4)
(0, 0)   (0.002?, 0.0103)   (0.0098, 0.0272)   (0.0092, 0.01?6)   (0.0063, 0.0097)
(1, 1)   (0.8904, 0.9454)   (0.9457, 0.9764)   (0.9625, 0.9833)   (0.9738, 0.9883)
(0, 0)   (0.4091, 0.0232)   (0.0206, 0.0000)   (0.0125, 0.0000)   (0.0078, 0.0000)

(0, 0)   (0.2606, 0.0890)   (0.0290, 0.0038)   (0.0193, 0.0064)   (0.0133, 0.0047)
(0, 0)   (0.1575, 0.0256)   (0.0364, 0.0081)   (0.0226, 0.00?9)   (0.0154, 0.0043)
(0, 0)   (0.0012, 0.0453)   (0.0000, 0.0005)   (0.0000, 0.0003)   (0.0000, 0.0002)
(1, 1)   (0.4887, 0.8969)   (0.9599, 0.9810)   (0.9737, 0.9874)   (0.9820, 0.9914)

Table 4.1: Results for the {4, 2, 4} multilayer perceptron using the complex back
propagation algorithm. Each block of four rows corresponds to one input pattern; a "?"
marks a digit that is not legible in the source.

Target   100 iterations     1000 iterations    2000 iterations    4000 iterations

(1, 1)   (not legible)      (0.9816, 0.9938)   (0.9874, 0.9988)   (0.9914, 0.9972)
(0, 0)   (0.0263, 0.0028)   (0.0078, 0.0007)   (0.0048, 0.0006)   (0.0031, 0.0008)
(0, 0)   (0.05?2, 0.0278)   (0.0125, 0.0049)   (0.0083, 0.0034)   (0.0086, 0.0023)
(0, 0)   (0.0405, 0.0093)   (0.0130, 0.0043)   (0.0092, 0.0032)   (0.0065, 0.0023)

(0, 0)   (0.0155, 0.0103)   (0.009?, 0.0048)   (0.0061, 0.0029)   (0.0011, 0.0019)
(1, 1)   (0.9238, 0.98?8)   (0.9?02, 0.9950)   (0.9871, 0.9971)   (0.9911, 0.9982)
(0, 0)   (0.0598, 0.0018)   (0.0136, 0.0003)   (0.0089, 0.0003)   (0.0060, 0.0002)
(0, 0)   (0.0653, 0.0153)   (0.0188, 0.0033)   (0.0106, 0.0022)   (0.0073, 0.0015)

(0, 0)   (0.0453, 0.0129)   (0.0131, 0.0036)   (0.0092, 0.0020)   (0.0063, 0.0011)
(0, 0)   (0.0048, 0.0295)   (0.0122, 0.0046)   (0.0080, 0.0022)   (0.0053, 0.0011)
(1, 1)   (0.9460, 0.9728)   (0.9828, 0.9931)   (0.9887, 0.9957)   (0.9924, 0.9972)
(0, 0)   (0.0417, 0.0029)   (0.0054, 0.0010)   (0.0033, 0.0013)   (0.0022, 0.0018)

(0, 0)   (0.0436, 0.0132)   (0.0094, 0.0037)   (0.0062, 0.0029)   (0.0043, 0.0023)
(0, 0)   (0.0708, 0.0012)   (0.0129, 0.0005)   (0.0083, 0.0004)   (0.0055, 0.0004)
(0, 0)   (0.0069, 0.0494)   (0.0091, 0.0082)   (0.0060, 0.0047)   (0.0041, 0.0029)
(1, 1)   (0.9244, 0.9895)   (0.9828, 0.9964)   (0.9884, 0.9972)   (0.9920, 0.9979)

Table 4.2: Results for the {4, 8, 4} multilayer perceptron using the complex back
propagation algorithm. Each block of four rows corresponds to one input pattern; a "?"
marks a digit that is not legible in the source.

4.3 Polynomial perceptron and fractionally spaced
bilinear perceptron

In this section we present the results obtained by the polynomial perceptron and the
fractionally spaced bilinear perceptron (FSBLP) as given in chapter 3. The
one-dimensional and two-dimensional structures of these algorithms are used for two
different exercises.

The one-dimensional structure of these algorithms is used to train the networks for the
ECG signal. Here the directly sampled values of the ECG signal are used as input
sequences, i.e., the input vector is considered one-dimensional; therefore the
algorithms given in chapter 3 are used with the complex nodes replaced by real nodes.

The recorded ECG signal used in our studies is the low Q signal. The samples of this
signal are taken at intervals of 5 msec, i.e., the signal is sampled at the rate of 200
samples per second. The total length of the ECG signal is 4096 samples. The amplitude of
the ECG signal is normalized between -1 and +1 because the nonlinearity used is the
hyperbolic tangent function, which has an output range from -1 to +1. The reason for
selecting this nonlinearity is the bipolar nature of the ECG signal.

The following architectures of the polynomial perceptron and fractionally spaced
bilinear perceptron networks are selected to train them with the ECG signal:

• Polynomial perceptron

- Number of feedforward taps = 8
- Degree of polynomial = 1
- Learning rate ($\eta$) = 0.6
- Momentum rate ($\alpha$) = 0.4
- Slope parameter ($\gamma$) = 1

• Fractionally spaced bilinear perceptron

- Number of feedforward taps = 8
- Number of feedback taps = 4
- Learning rate ($\eta_a$) = 0.6
- Learning rate ($\eta_b$) = 0.6
- Learning rate ($\eta_c$) = 0.6
- Momentum rate ($\alpha$) = 0.4
- Slope parameter ($\gamma$) = 1

Both of the above networks are trained with a small segment of the first 200 samples of
the ECG signal using the specified learning rate and momentum rate.

The next segment of 600 samples after the first 200 samples is used as the input
sequence for the trained networks. The results are plotted in Fig. 4.3. These results
show that the output signals of both networks are very close to the input signals.
Fig. 4.3(d) illustrates the relationship between the global mean square error and the
number of training times for PP and FSBLP. It shows that the mean square error can be
restricted to -40 dB by using the FSBLP with $\alpha = 0.4$.

In the next step, 15 dB of random noise is added to the ECG signal and the FSBLP network
is trained with a segment of the first 1200 samples of this noisy ECG signal. The
learning parameters are selected as $\eta = 0.2$ and $\alpha = 0.01$. The numbers of
feedforward and feedback taps are selected as 20 and 10, respectively. Then another
segment of 400 samples is selected as the input sequence to this trained network.
Fig. 4.4 illustrates the resultant output of this network. The recovered signal shows
that the network could not retrieve the P wave of the ECG signal correctly.

Fig. 4.4: (a) Original ECG signal, (b) Noisy ECG signal, (c) Recovered ECG signal.

In the next step we have investigated the behaviour of the complex structure of the
polynomial perceptron and the fractionally spaced bilinear perceptron. The algorithms
are implemented as given in chapter 3, and the networks are trained to suppress the
noise in a 16-level quadrature amplitude modulation system.

Quadrature amplitude modulation (QAM): Digital communication uses only a finite number
of symbols for communication. The minimum number of symbols is two, in the case of
binary transmission. When communication uses $M$ symbols of different amplitudes, it is
known as $M$-ary QAM communication. This multi-amplitude signaling allows us to transmit
each group of $\log_2 M$ binary digits by one $M$-ary pulse. The block diagram of the
MQAM digital communication system is shown in Fig. 4.5. We have considered one practical
case with $M = 16$, i.e., the following 16 pulses are used for 16 different symbols:

$$ p_i(t) = a_i\, p(t) \cos \omega_c t + b_i\, p(t) \sin \omega_c t \tag{4.1} $$

where $p_i(t)$ is the transmitted signal, $p(t)$ is a properly shaped baseband pulse,
$\omega_c$ is the carrier angular frequency and $[a_i(t) + j b_i(t)]$ is the equivalent
complex baseband MQAM signal. The choices of $a_i$ and $b_i$ for the 16 pulses are shown
in Fig. 4.6(a). Since $M = 16$, each pulse transmits the information of
$\log_2 16 = 4$ binary digits. Every possible 4-bit sequence is thus transmitted by a
particular $(a_i, b_i)$, and a single pulse as represented by equation (4.1) transmits
four bits of information.
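
The mapping between 4-bit groups and the pairs $(a_i, b_i)$ can be sketched as follows. The actual constellation of Fig. 4.6(a) is not reproduced in the text, so the amplitude levels (-3, -1, +1, +3) and the bit ordering used here are assumptions chosen only for illustration, as are the function names.

```python
import itertools

# Assumed amplitude levels per axis; the thesis takes them from Fig. 4.6(a).
LEVELS = (-3, -1, 1, 3)


def bits_to_symbol(bits):
    """Map a tuple of 4 bits to a complex baseband symbol a + jb."""
    a = LEVELS[2 * bits[0] + bits[1]]   # first two bits select a
    b = LEVELS[2 * bits[2] + bits[3]]   # last two bits select b
    return complex(a, b)


def symbol_to_bits(z):
    """Recover the 4 bits by choosing the nearest level on each axis."""
    def nearest(v):
        return min(range(4), key=lambda k: abs(LEVELS[k] - v))
    ia, ib = nearest(z.real), nearest(z.imag)
    return (ia >> 1, ia & 1, ib >> 1, ib & 1)


# All 16 symbols of the assumed constellation, keyed by their 4-bit group.
CONSTELLATION = {bits: bits_to_symbol(bits)
                 for bits in itertools.product((0, 1), repeat=4)}
```

The nearest-level decision in `symbol_to_bits` is a simple thresholding rule; it recovers the transmitted bits as long as the perturbation on each axis stays below half the level spacing.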
The received signal $r(t)$ may be written as follows:

$$ r(t) = \mathrm{Re}\{[x_i(t) + j y_i(t)]\, e^{j\omega_c t}\} \tag{4.2} $$

where $[x_i(t) + j y_i(t)]$ is the equivalent complex baseband signal of $r(t)$. This
complex baseband signal may be written as follows:

$$ x(t) = x_i(t) + j y_i(t) = [a_i(t) + j b_i(t)]\, R_1 e^{j\theta_1(t)} + [a_i(t) + j b_i(t)]\, R_2 e^{j(\theta_2(t) + \omega_c \tau)} + [n_I(t) + j n_Q(t)] \tag{4.3} $$

where $[R_1 e^{j\theta_1(t)} + R_2 e^{j(\theta_2(t) + \omega_c \tau)}]$ is the two-ray complex
baseband model of the frequency-selective fading process on the transmitted signal,
$[n_I(t) + j n_Q(t)]$ is the complex baseband Gaussian noise and $\tau$ is the relative
delay between the main and reflected rays. $R_i(t)$, $i = 1, 2$, are Rayleigh processes.
The $R_i(t)$ and $\theta_i(t)$ are assumed to be statistically independent.

It is assumed that the symbol recovery time is ideal; $x(t)$ is the sample of the
received complex baseband signal. Fig. 4.6(b) illustrates the noisy received samples.
These received samples are passed through the channel equalizer (neural network) before
being mapped to the estimated signal vector.

The following architectures of the polynomial perceptron and the fractionally spaced
bilinear perceptron are used for noise suppression in the 16-QAM signals:

• Polynomial perceptron

- Number of feedforward taps = 20
- Degree of polynomial = 4
- Learning rate ($\eta$) = 0.0008
- Momentum rate ($\alpha$) = 0.001
- Slope parameter ($\gamma$) = 1

• Fractionally spaced bilinear perceptron

- Number of feedforward taps = 20
- Number of feedback taps = 10
- Learning rate ($\eta_a$) = 0.01
- Learning rate ($\eta_b$) = 0.01
- Learning rate ($\eta_c$) = 0.001
- Momentum rate ($\alpha$) = 0.001
- Slope parameter ($\gamma$) = 1

In both cases, the algorithms are used to train the networks with 5000 input samples.
The input samples are the noisy 16-level QAM signal, where the random noise is added in
such a manner that the samples can overlap the nearby patterns.

In these algorithms the learning rate and the momentum rate are considerably small
because the number of training samples and the number of weight coefficients in the
polynomial perceptron are very large. The number of weight coefficients for the above
structure of the polynomial perceptron is 10626, which is very large compared to 209 for
the FSBLP; therefore the learning rate and momentum rate for the polynomial perceptron
are chosen much smaller than those for the FSBLP.
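
The two weight counts can be reproduced with a short calculation. The sketch assumes that the polynomial perceptron uses one weight per monomial of degree 0 up to $l$ in the $m$ tap values, which gives $\binom{m+l}{l}$ weights; equation (3.4) is not shown in this chapter, so this formula is an assumption, although it does yield 10626 for 20 taps and degree 4. For the FSBLP the count follows from the index ranges $i = 0, \ldots, m-1$ and $j = 1, \ldots, n-1$. The function names are hypothetical.

```python
from math import comb


def pp_weight_count(taps, degree):
    """Number of monomials of degree 0..degree in `taps` variables."""
    return comb(taps + degree, degree)


def fsblp_weight_count(m, n):
    """m feedforward weights a_i, (n - 1) feedback weights c_j, m(n - 1) cross terms b_ij."""
    return m + (n - 1) + m * (n - 1)


print(pp_weight_count(20, 4))      # 10626
print(fsblp_weight_count(20, 10))  # 209
```

The gap widens quickly with the polynomial degree, whereas the FSBLP count grows only with the product of the two tap numbers.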

Initially, the networks are trained with these 5000 noisy samples. Here the noise is
added such that the samples may overlap up to 12 percent of the nearby patterns, and the
signal-to-noise ratio is restricted to as low as 15 dB.

Another 15000 noisy input samples with the same noise level are used as input sequences
for these trained networks to find the corresponding output sequences. Fig. 4.6(a)
illustrates the ideal 16-QAM signal constellation, where the patterns are named P1
through P16. Fig. 4.6(b) illustrates the 15000 noisy samples. These noisy received
samples are used as the input for both networks. The output results are illustrated in
Fig. 4.6(c) and Fig. 4.6(d).

The output results of the polynomial perceptron show that all of the signals belonging
to patterns P1, P3, P11, P12, P13, P15 and P16 are mapped correctly, but a few of the
signals from the other patterns are mapped wrongly.

The output results of the fractionally spaced bilinear perceptron show that all signals
belonging to patterns P1, P2, P6, P7, P10, P11, P12, P13, P15 and P16 are mapped
correctly; only very few of the signals from patterns P3, P4, P5, P8, P9 and P14 are
mapped wrongly.

These results show that 98 percent of the samples are recovered correctly from noisy
signals with 12 percent overlap by using the fractionally spaced bilinear perceptron,
which gives the least probability of error.

In the next step, the FSBLP algorithm is used to suppress the noise and other channel
interference in ECG signals. Here the 16-level quadrature amplitude modulation system is
used for adaptive equalization of random noise.

The ECG signals are quantized into 12 bits; each 12-bit sequence is converted into 3
sequences of 4 bits each, and each 4-bit sequence is sent sequentially at regular time
intervals.
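
This packing step can be sketched as follows. The uniform quantizer over the normalized range [-1, +1] and the most-significant-group-first ordering are assumptions made for this illustration; the text specifies only the 12-bit resolution and the split into three 4-bit groups. The function names are hypothetical.

```python
def quantize_12bit(sample):
    """Map a sample in [-1, 1] to an integer code in 0..4095 (uniform, assumed)."""
    clipped = max(-1.0, min(1.0, sample))
    return round((clipped + 1.0) / 2.0 * 4095)


def split_nibbles(code):
    """Split a 12-bit code into three 4-bit groups, most significant first."""
    return [(code >> 8) & 0xF, (code >> 4) & 0xF, code & 0xF]


def join_nibbles(nibbles):
    """Rebuild the 12-bit code from its three 4-bit groups."""
    high, middle, low = nibbles
    return (high << 8) | (middle << 4) | low


def dequantize_12bit(code):
    """Map a 12-bit code back to the range [-1, 1]."""
    return code / 4095 * 2.0 - 1.0
```

Each 4-bit group then selects one of the 16 QAM symbols, so every ECG sample is carried by three consecutive symbols.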

15 dB of random noise is again added to this 16-level QAM signal of ECG data. The FSBLP
network is then trained with a segment of the first 1300 of these ECG data. The
following architecture of the FSBLP is considered for this exercise:

- Number of feedforward taps = 30
- Number of feedback taps = 15
- Learning rate ($\eta_a$) = 0.01
- Learning rate ($\eta_b$) = 0.01
- Learning rate ($\eta_c$) = 0.001
- Momentum rate ($\alpha$) = 0.00? (last digit not legible)
- Slope parameter ($\gamma$) = 0.2

In the ECG signal, after the P-Q delay the pacemaker pulse causes a sharp ventricular
contraction (the QRS complex); it therefore becomes difficult to train the network with
a nonlinear sigmoidal function having unity slope.

It is observed that if the slope of the sigmoidal function is reduced to 0.2, the
network is trained with the desired accuracy.

After completing the training, another segment of the next 4800 samples of these noisy
signals is used as the input for this trained network. Then, using simple thresholding,
the output sequences of the FSBLP are mapped to their corresponding patterns. The ECG
signal is reconstructed from the thresholded output sequences. Fig. 4.7 illustrates the
original ECG signal and the recovered ECG signal after suppression of the 15 dB of
random channel noise.

The recovered signal shows that the network could not retrieve the first QRS complex,
but the rest of the signal is recovered well.

Fig. 4.7: (a) Original ECG signal, (b) Recovered ECG signal from 15 dB noise.






Chapter 5 


Conclusions 


By viewing the equalization problem in long-distance medical telemetry as a
classification problem, the influence of channel nonlinearities and additive noise on
the optimal solution has been investigated. It has been shown that the neural network
approach is effective for adaptively equalizing the noise. The back propagation and
complex back propagation algorithms are analyzed. The complex-neuron-based polynomial
perceptron and fractionally spaced polynomial perceptron algorithms are put forward for
a 16-level quadrature amplitude modulation system to recover the ECG signals from a
noisy environment. Both of these algorithms are applied to the 16-QAM system.

Computer simulations have been carried out and the results are compared. The preliminary
results obtained are satisfactory; therefore the following conclusions can be drawn:

• The practical difficulties associated with the back propagation and complex back
propagation algorithms are the selection of the number of hidden layers and the number
of nodes in each hidden layer. The selection of learning parameters is experimental. The
rate of convergence can be increased by adding the momentum term to the algorithms, but
this also increases the computational complexity.

• The polynomial perceptron and fractionally spaced polynomial perceptron structured
with the nonlinear sigmoid-type hyperbolic tangent activation function are capable of
approximating the optimal solution. The slope parameter of the activation function plays
an important role in convergence. In practice, it is found that a slope parameter of 0.2
can train the network with the desired accuracy.

• The complexity of the polynomial perceptron algorithm is determined by two structure
parameters, namely the order and the degree of the polynomial. The practical selection
of the order and degree of the polynomial has been discussed. A polynomial perceptron of
4th degree is sufficient to approximate signals within the specified accuracy.

• The structure of the fractionally spaced bilinear perceptron is simple. It has a
faster convergence rate and lower computational complexity than the polynomial
perceptron. The performance of the fractionally spaced bilinear perceptron is better
than that of the polynomial perceptron.

• The fractionally spaced bilinear perceptron network recovered the 16-level QAM
signals with a lower probability of error.

• The ECG signal is retrieved satisfactorily with the fractionally spaced bilinear
perceptron. The network has difficulty only in retrieving the first QRS complex because,
in this technique, the weight update is recurrent, i.e., weights are also updated during
output generation. During output generation, the desired output response is substituted
by its estimated value, and the algorithm can be employed continuously to track a
time-varying environment.
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